ROC AND THE BOUNDS ON TAIL PROBABILITIES VIA
THEOREMS OF DUBINS AND F. RIESZ 

By Eric Clarkson, J. L. Denny and Larry Shepp

University of Arizona, University of Arizona and Rutgers University 

For independent X and Y in the inequality $P(X < Y + \mu)$, we
give sharp lower bounds for unimodal distributions having finite vari- 
ance, and sharp upper bounds assuming symmetric densities bounded 
by a finite constant. The lower bounds depend on a result of Dubins 
about extreme points and the upper bounds depend on a symmetric 
rearrangement theorem of F. Riesz. The inequality was motivated by 
medical imaging: find bounds on the area under the Receiver Oper- 
ating Characteristic curve (ROC). 

1. Introduction. We give sharp upper and lower bounds on 

$$P(X < Y + \mu),$$

where the independent variables X and Y have zero means and satisfy ei- 
ther unimodality or symmetry conditions. The lower bounds assume uni- 
modality and use a theorem of Dubins [5] about extreme points, while the 
upper bounds assume symmetry and use a theorem of F. Riesz [13] about 
symmetric rearrangements. We emphasize that our basic inequalities in the 
lower-bound case are known, proved earlier by various authors starting with 
Gauss. Our justification for proving the known theorems is mainly to show 
that the bounds are sharp and perhaps to indicate another approach. 

Both bounds were motivated by a widely used methodology in medi- 
cal imaging, the ROC (Receiver Operating Characteristic) curve, known to 
statisticians as a power function (not necessarily of the most powerful test). 
A widely used interpretation of the ROC curve is the AUC (Area under the
Curve), defined next.

Thus, we are given variables $X_i$, $i = 0, 1$, having continuous distribution
functions $F_i(x) = P(X_i < x)$. Letting, for $0 < \alpha < 1$,
$x(\alpha) = \sup\{x : 1 - F_0(x) > \alpha\}$, the ROC curve is

$$\alpha \mapsto 1 - F_1(x(\alpha)),$$

the AUC being

(1) $\mathrm{AUC}(X_0, X_1) = \int_0^1 \bigl(1 - F_1(x(\alpha))\bigr)\, d\alpha.$

If the variables $X_i$ are independent, then (1) equals

(2) $P(X_0 < X_1)$,

an identity first proved by Bamber [1], at least in the medical imaging liter- 
ature. 
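The identity between (1) and (2) can be checked numerically. The sketch below is only an illustration under assumptions that are not in the paper: it takes independent $X_0 \sim N(0,1)$ and $X_1 \sim N(d,1)$ (the shift `d` and both function names are hypothetical), integrates the power $1 - F_1(x(\alpha))$ over $\alpha$ by the midpoint rule, and compares the result with the closed form $P(X_0 < X_1) = \Phi(d/\sqrt{2})$.

```python
import math
from statistics import NormalDist


def auc_by_integration(d, n=20000):
    """Midpoint-rule value of (1) for X0 ~ N(0,1), X1 ~ N(d,1) (illustrative choice)."""
    f0, f1 = NormalDist(0.0, 1.0), NormalDist(d, 1.0)
    total = 0.0
    for k in range(n):
        alpha = (k + 0.5) / n
        x_alpha = f0.inv_cdf(1.0 - alpha)   # threshold with 1 - F0(x) = alpha
        total += 1.0 - f1.cdf(x_alpha)      # power of the test at level alpha
    return total / n


def auc_closed_form(d):
    """P(X0 < X1) for the same pair, using X0 - X1 ~ N(-d, 2)."""
    return NormalDist().cdf(d / math.sqrt(2.0))


if __name__ == "__main__":
    print(auc_by_integration(1.0), auc_closed_form(1.0))
```

The two values should agree up to the quadrature error; with `d = 0` both reduce to 1/2, as expected when the two distributions coincide.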

One role of the AUC is the following. An experimenter wants to compare two
medical imaging modalities to decide which best detects a tumor. For example,
one may compare X-rays against MRI images, although often one compares
"filters" for the same modality [other modalities include positron emission
tomography (PET), single-photon emission computed tomography (SPECT)
and ultrasound]. Imagine that a large number of experimenters each test
the hypothesis of $F_0$ (no tumor) against $F_1$ (tumor) and that each chooses a
level of significance $\alpha$ according to a uniform distribution. (The use of a
random $\alpha$ reflects the differing levels of significance of different
experimenters.) For each experimenter, the hypothesis $F_0$ is rejected in favor
of $F_1$ if a scalar observable exceeds a constant $x(\alpha)$. Thus the AUC gives
the average of the power function $1 - F_1(x(\alpha))$.

The equality of (2) and (1) for continuous distributions leads to a second
widely used method, the 2AFC (Two-Alternative Forced Choice). In this
case the experimenter is confronted with two choices, perhaps two different
imaging modalities, perhaps a "signal" or "no signal." The experimenter
uses a test statistic, rejecting the hypothesis of no signal in favor of a signal
if the test statistic is large. More precisely, the experimenter computes the
test statistic, applies it to the two data sets, the signal and the nonsignal
(not knowing which is which), and chooses the data set giving the larger
value of the statistic as the signal. In (2), the distribution of $X_1$ is that of
the statistic when the signal is present, and that of $X_0$ when the signal is
absent.

The ROC was developed during World War II for analyzing the performance
of radar systems. Today ROC analysis is regularly used in the health
care industry and by the Food and Drug Administration to evaluate new
imaging systems, diagnostic tests, treatments and pharmaceuticals. Often, if
not invariably, a Gaussian assumption is made on a test statistic, typically the
log-likelihood ratio. The bounds obtained here are of course weaker. For
example, without assuming a Gaussian distribution or the equality of modes
(but assuming $\mu_X = \mu_Y$, $\sigma_X = \sigma_Y = 1$), the unimodality lower
bound when $\mu = 2\sqrt{6} \approx 4.899$ is

$$P(X < Y + 2\sqrt{6}) \ge 0.5,$$

while a Gaussian distribution gives a lower bound greater than 0.99966 (since
$\sqrt{12} = 3.4641$); claimed differences may be the result of the Gaussian
assumption. Some sort of compromise is needed.
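The Gaussian figure quoted above is easy to reproduce. The following sketch assumes independent standard normal $X$ and $Y$, so that $X - Y \sim N(0, 2)$ and $P(X < Y + \mu) = \Phi(\mu/\sqrt{2})$; the function name `gaussian_auc` is introduced here for illustration only.

```python
import math
from statistics import NormalDist


def gaussian_auc(mu):
    """P(X < Y + mu) for independent X, Y ~ N(0, 1), using X - Y ~ N(0, 2)."""
    return NormalDist().cdf(mu / math.sqrt(2.0))


mu = 2.0 * math.sqrt(6.0)            # shift at which the unimodal bound equals 0.5
print(round(mu, 3))                   # 4.899
print(mu / math.sqrt(2.0))            # sqrt(12), about 3.4641
print(gaussian_auc(mu) > 0.99966)     # True
```

The gap between 0.5 and a value above 0.99966 at the same shift is the point of the comparison: the distribution-free bound is far more conservative than the Gaussian calculation.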

ROC analysis in medical imaging has an enormous literature. We mention 
the book of Swets and Pickett [14] and the papers of Metz [11], Clarkson [3] 
and Barrett, Abbey and Clarkson [2]. We also mention that researchers in
medicine and in psychophysics (the branch of psychology that studies the
relations between physical stimuli and sensory response) use a functional
relation between the AUC and the SNR (Signal-to-Noise Ratio), where in
medical imaging, the SNR is defined as the ratio of the mean pixel value to
the standard deviation of the pixel values.

In this paper we study the behavior of (2) under unimodality or symmetry
assumptions. Lower and upper confidence bounds on a translation parameter
$\mu$ defined below are clearly available although not discussed here. Because (2)
and (1) are equal for continuous distributions, we state our results in terms
of (2). For the lower bounds in the unimodal case we constrain the variances
to equal 1. For the upper bounds in the symmetric case we constrain the
densities to be bounded by $b < \infty$. Although the second constraint may
appear unfamiliar (if not unnatural), it is easy to see that neither constraint
is relevant to the other case.



2. Dubins's theorem: lower bounds for $P(X < Y + \mu)$. Throughout
this section we assume that $X_0$ has a mode equal to zero. Rather than
assuming that $X_1$ has a mode equal to $\mu > 0$, we find it convenient to assume
that $X_1$ has a mode equal to zero and then study $X_1 + \mu$.

We begin with sharp upper bounds for symmetric unimodal distributions 
and we recall the definition of unimodal distributions (page 155 of Feller 
[6]): a distribution function $F$ is unimodal at $m$ if $F$ is convex on $(-\infty, m)$
and concave on $(m, \infty)$. Note that $F$ may assign positive mass to the point
m and that a mode need not be unique. 

Gauss (see Pukelsheim [12]) proved the following inequality for a variable
$X$ having a continuous unimodal distribution with mode $m$, where
$\tau^2 = E[(X - m)^2]$:

$$P(|X - m| > s) \le \frac{4\tau^2}{9s^2}, \qquad s \ge \frac{2\tau}{\sqrt{3}}.$$

Pukelsheim [12] gives a useful survey of the inequalities which followed 
that of Gauss. The method of extreme points was used earlier by Dharmad- 
hikari and Joag-Dev [4]. 

We need a special case of a theorem of L. Dubins [5].

Theorem 1 (Dubins). Let $A$ be a compact convex subset of a locally
convex space and let $T$ be a real continuous linear functional on $A$. Then
each extreme point of $A \cap \{x : T(x) = y\}$ is a convex combination of at most
two extreme points of $A$.

The inequality in the next lemma and its corollary are known, mentioned 
in the references above. What is new (we believe) is a sharp one-sided in- 
equality. Dubins's theorem leads naturally to the bounds and the distribu- 
tions achieving the bounds. 

In order to use a compactness argument on the space of distributions, we
initially assume that all distributions are supported on the interval $[-N, N]$.
Since $N$ is arbitrary, it is easy to verify that all assertions will extend to the
case that the distributions are supported on $(-\infty, \infty)$.

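Lemma 2. Fix $t > 0$ and assume that $t < 2N/3$. If $X$ has a symmetric
unimodal distribution supported on $[-N, N]$ and $\mathrm{var}(X) = 1$, then

$$P(|X| > t) \le \begin{cases} 1 - t/\sqrt{3}, & 0 < t \le 2/\sqrt{3}, \\ 4/(9t^2), & t \ge 2/\sqrt{3}. \end{cases}$$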



Proof. We fix $N > 2$, let $\mathcal{F}$ denote the set of symmetric unimodal
distributions on $[-N, N]$, and note that $\mathcal{F}$ is a compact convex set in the
weak topology. It is easy to see that the extreme points of $\mathcal{F}$ are the Dirac
probability $\delta_0$ and the boxcar densities $f_a(x)$, where for $0 < a \le N$,

$$f_a(x) = \begin{cases} 1/(2a), & |x| < a, \\ 0, & |x| > a. \end{cases}$$


Define a continuous linear functional $T$ on $\mathcal{F}$ by

$$T(F) = \int_{-N}^{N} x^2\, dF(x),$$

and let $\mathcal{F}(1) \subset \mathcal{F}$ denote the compact convex set where $T(F) = 1$, that is,
the variance equals 1. Dubins's theorem implies that the extreme points of
$\mathcal{F}(1)$ are the distributions of the form

(3) $(1 - \beta)\delta_0 + \beta F_2$,

(4) $(1 - \beta)F_1 + \beta F_2$,

where the $F_i$ are distribution functions having boxcar densities and
$0 \le \beta \le 1$, and the variance of (3) and of (4) equals 1. Fix $0 < t < 2N/3$ and
define the continuous linear functional on $\mathcal{F}(1)$,

$$T_t(F) = \int_{|x| > t} dF(x).$$

Because the maximum of $T_t(F)$ on $\mathcal{F}(1)$ is obtained at an extreme point,
to prove the lemma it suffices to calculate the values of $T_t(F)$ on (3) and
(4). □
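The calculation on extreme points of type (3) can be explored numerically. The sketch below is an illustration, not the authors' computation: it assumes a mixture $(1-\beta)\delta_0 + \beta\,\mathrm{Uniform}[-a, a]$ with variance 1, which forces $\beta = 3/a^2$ and $a \ge \sqrt{3}$, and searches a grid of widths $a$ for the largest value of $P(|X| > t)$. The Gauss-type bound used for comparison, the truncation `n_max` (standing in for $N$) and all function names are assumptions made here. Mixtures of type (4) are not examined.

```python
import math


def gauss_bound(t):
    """Gauss-type bound on P(|X| > t) for symmetric unimodal X with variance 1."""
    if t <= 2.0 / math.sqrt(3.0):
        return 1.0 - t / math.sqrt(3.0)
    return 4.0 / (9.0 * t * t)


def tail_type3(a, t):
    """P(|X| > t) for (1 - beta) * delta_0 + beta * Uniform[-a, a], beta = 3 / a^2."""
    beta = 3.0 / (a * a)
    return beta * max(0.0, 1.0 - t / a)


def best_type3(t, n_max=50.0, steps=100000):
    """Largest tail probability over a grid of admissible widths a in [sqrt(3), n_max]."""
    lo = math.sqrt(3.0)
    return max(tail_type3(lo + (n_max - lo) * k / steps, t) for k in range(steps + 1))


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 5.0):
        print(t, best_type3(t), gauss_bound(t))
```

For $t > 2/\sqrt{3}$ the grid maximum occurs near $a = 3t/2$, which suggests why the restriction $t < 2N/3$ appears: the maximizing width must fit inside $[-N, N]$.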

Now let $X$ have a unimodal distribution function $H$ with mode at the
origin, $\mathrm{var}(X) = 1$, mean $\mu_X$, and let $X'$ denote the variable with the
symmetric unimodal distribution function $\tfrac12\bigl(H(x) + 1 - H(-x)\bigr)$. Then $X'$ has
variance $1 + \mu_X^2$.

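Corollary 3. For $t > 0$,

$$P\bigl(|X| > t(1 + \mu_X^2)^{1/2}\bigr) = 2P\bigl(X' > t(1 + \mu_X^2)^{1/2}\bigr) \le \begin{cases} 1 - t/\sqrt{3}, & 0 < t \le 2/\sqrt{3}, \\ 4/(9t^2), & t > 2/\sqrt{3}. \end{cases}$$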



Theorem 4. Let $X$ have a unimodal distribution with a mode at the
origin, $\mathrm{var}(X) = 1$, and mean $\mu_X$. Then for $t > 0$,

$$P(X > t) \le \begin{cases} 1 - t/(2\sqrt{3}), & 0 < t \le 4/\sqrt{3} \approx 2.3094, \\ 4(1 + \mu_X^2)/(9t^2), & 4/\sqrt{3} \le t. \end{cases}$$

If $t < 4/\sqrt{3}$, then the bound is obtained at the density

(5) $f(x) = 1/(2\sqrt{3})$, $0 < x < 2\sqrt{3}$.

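Fix $t > 4/\sqrt{3}$. The bound is obtained at the distribution

$$\Bigl(1 - \frac{4}{3u^2}\Bigr)\delta_0 + \frac{8(u^2 - 1)^{1/2}}{9u^4}\, I_{(0,\, A(u))},$$

where

$$A(u) = \frac{3u^2}{2(u^2 - 1)^{1/2}}, \qquad \mu_X = \frac{1}{(u^2 - 1)^{1/2}},$$

and $u$ satisfies: $t$ is the positive root of

(6) $t^3 - A(u)\, t^2 + \dfrac{u^6}{2(u^2 - 1)^{3/2}} = 0.$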

Remark 5. Always $u \ge 2/\sqrt{3}$, and when $u = 2/\sqrt{3}$ the positive root is
$t = 4/\sqrt{3}$. If $t$ is large so that $u$ is also large, then (6) is approximately the
polynomial in the variable $t$,

$$t^3 - \frac{3ut^2}{2} + \frac{1}{2}u^3,$$

whose unique positive root $t$ satisfies $t = u$. Let $t$ be given and let $u(t)$ be such
that $t$ is the root of (6) with $u(t)$ the constant. For example, if $t = 3.18198$, then
$u(t) = 3.0$, and if $t = 8.063242$, then $u(t) = 8.0$. It is not difficult to verify
that $t > u(t)$ and $\lim t/u(t) = 1$ as $u \to \infty$.
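The numerical examples in Remark 5 can be reproduced with a few lines. The sketch below rests on a derivation made here, not on a formula quoted from the paper: equating the tail probability $\frac{4}{3u^2}\bigl(1 - \frac{2t(u^2-1)^{1/2}}{3u^2}\bigr)$ of the extremal mixture with the bound $\frac{4u^2}{9t^2(u^2-1)}$ and clearing denominators gives the cubic $2(u^2-1)^{3/2}t^3 - 3u^2(u^2-1)t^2 + u^6 = 0$. The closed form $t = u^2/(u^2-1)^{1/2}$ is a candidate root checked by the code; both function names are hypothetical.

```python
import math


def cubic(u, t):
    """Assumed cubic linking t and u, obtained by equating the tail with the bound."""
    s = u * u - 1.0
    return 2.0 * s ** 1.5 * t ** 3 - 3.0 * u * u * s * t ** 2 + u ** 6


def t_of_u(u):
    """Candidate positive root t = u^2 / (u^2 - 1)^(1/2)."""
    return u * u / math.sqrt(u * u - 1.0)


if __name__ == "__main__":
    for u in (2.0 / math.sqrt(3.0), 3.0, 8.0):
        t = t_of_u(u)
        print(u, t, cubic(u, t), t / u)
```

Under this closed form $t/u = u/(u^2-1)^{1/2}$, which is larger than 1 and tends to 1 as $u$ grows, in line with the last sentence of the remark.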

Proof of Theorem 4. To find the upper bound, we claim that we
may assume that the unimodal $X \ge 0$. For if $X$ has a unimodal distribution
with mode at the origin, $\mathrm{var}(X) = 1$, we may define a unimodal $S \ge 0$ with
mode at the origin, $\mathrm{var}(S) = 1$, so that for all $t > 0$,

$$P(S > t) \ge P(X > t).$$

If $X \le 0$ choose, say, $S = -X$. Otherwise, take that part of the distribution
of $X$ which is supported on $(-\infty, 0)$ and put it on $\{0\}$; call the new variable
$Y$. Because

$$E(Y^2) \le E(X^2), \qquad E(Y) \ge E(X),$$

we have

$$0 < \gamma^2 \equiv \mathrm{Var}(Y) \le 1.$$

Then $S = Y/\gamma \ge 0$ is unimodal with mode at the origin, $\mathrm{var}(S) = 1$, and for
$t > 0$,

$$P(S > t) = P(X > \gamma t) \ge P(X > t).$$
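Continuing the proof, from Corollary 3 and Lemma 2,

$$P(X > t) = 2P(X' > t) = 2P\bigl(X'/(1 + \mu_X^2)^{1/2} > t/(1 + \mu_X^2)^{1/2}\bigr)$$

(7) $\le \begin{cases} 1 - t/\bigl(\sqrt{3}(1 + \mu_X^2)^{1/2}\bigr), & 0 < t < 2(1 + \mu_X^2)^{1/2}/\sqrt{3}, \\ 4(1 + \mu_X^2)/(9t^2), & 2(1 + \mu_X^2)^{1/2}/\sqrt{3} \le t. \end{cases}$

If $X$ has the density (5), then $\mu_X = \sqrt{3}$ and $X$ satisfies (7) for $0 < t < 4/\sqrt{3}$.
For $u \ge 2/\sqrt{3}$ let $X$ have the distribution

$$X \sim \Bigl(1 - \frac{4}{3u^2}\Bigr)\delta_0 + \frac{8(u^2 - 1)^{1/2}}{9u^4}\, I_{(0,\, 3u^2/(2(u^2 - 1)^{1/2}))}.$$

Then $\sigma_X = 1$ and, since $\mu_X = (u^2 - 1)^{-1/2}$, $1 + \mu_X^2 = u^2/(u^2 - 1)$. For
$t \ge (2/\sqrt{3})\bigl(u^2/(u^2 - 1)\bigr)^{1/2}$,

(8) $P(X > t) = \dfrac{4}{3u^2}\Bigl(1 - \dfrac{2t(u^2 - 1)^{1/2}}{3u^2}\Bigr).$

Now fix $t > 4/\sqrt{3}$. We want a value of $u$ so that (8) equals

$$\frac{4u^2}{9t^2(u^2 - 1)} \quad \Bigl(= \frac{4(1 + \mu_X^2)}{9t^2}\Bigr).$$

This is satisfied by (6). □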

Following Ibragimov [8], a unimodal distribution function is strong
unimodal if its composition (convolution) with any unimodal distribution
function is unimodal. Ibragimov proved that a distribution function $F$ is
strong unimodal if and only if $F$ is continuous unimodal and its density $f$
satisfies: $\log f(x)$ is a concave function on the interior of the support of $F$.

Let independent $X$ and $Y$ have unimodal distributions with modes (not
necessarily unique) $m_X$ and $m_Y$, means $\mu_X = \mu_Y = 0$ and standard
deviations $\sigma_X = \sigma_Y = 1$. We assume that at least one of the unimodal
distributions is strong unimodal and recall from [9] that a mode
$m_{(X-Y)/\sqrt{2}}$ of $(X - Y)/\sqrt{2}$ satisfies

(9) $|m_{(X-Y)/\sqrt{2}}| \le \sqrt{3}.$

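Corollary 6. Fix $\mu > \sqrt{6}$. Then

(10) $P(X < Y + \mu) \ge \begin{cases} (\mu - \sqrt{6})/(2\sqrt{6}), & \sqrt{6} < \mu \le \sqrt{6} + 4\sqrt{2/3} = 5.7155, \\ 1 - 32\bigl(3(\mu - \sqrt{6})\bigr)^{-2}, & \sqrt{6} + 4\sqrt{2/3} \le \mu. \end{cases}$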



Remark 7. The only case of interest, $P(X < Y + \mu) > 0.5$, requires
$\mu > 2\sqrt{6} \approx 4.899$.
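The piecewise bound of Corollary 6 is simple to evaluate. The sketch below assumes the two branches read $(\mu - \sqrt{6})/(2\sqrt{6})$ up to the breakpoint $\sqrt{6} + 4\sqrt{2/3}$ and $1 - 32/(3(\mu - \sqrt{6}))^2$ beyond it; the function name `unimodal_lower_bound` and the constants are introduced here for illustration.

```python
import math

SQRT6 = math.sqrt(6.0)
BREAK = SQRT6 + 4.0 * math.sqrt(2.0 / 3.0)   # about 5.7155


def unimodal_lower_bound(mu):
    """Lower bound on P(X < Y + mu) as read from Corollary 6; needs mu > sqrt(6)."""
    if mu <= SQRT6:
        raise ValueError("the bound is stated only for mu > sqrt(6)")
    if mu <= BREAK:
        return (mu - SQRT6) / (2.0 * SQRT6)
    return 1.0 - 32.0 / (3.0 * (mu - SQRT6)) ** 2


if __name__ == "__main__":
    for mu in (2.0 * SQRT6, BREAK, 8.0, 12.0):
        print(mu, unimodal_lower_bound(mu))
```

At $\mu = 2\sqrt{6}$ the first branch equals 0.5, matching Remark 7, and the two branches agree (both give 2/3) at the breakpoint.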

Proof of Corollary 6. If we define

$$Z = \frac{1}{\sqrt{2}}(X - Y - m_{X-Y}),$$

then $Z$ is unimodal with a mode at the origin, mean equal to $-m_{X-Y}/\sqrt{2}$
and variance equal to 1. The left-hand side of (10) is

(11) $P\Bigl(Z < \frac{1}{\sqrt{2}}(\mu - m_{X-Y})\Bigr).$

Thus among all such unimodal variables (11) is minimized with
$m_{X-Y}/\sqrt{2} = \sqrt{3}$, using (9). To complete the proof it suffices to find a
least upper bound for

$$P\Bigl(Z \ge \frac{1}{\sqrt{2}}(\mu - \sqrt{6})\Bigr), \qquad \mu > \sqrt{6},$$

using Theorem 4. □

3. F. Riesz's theorem: upper bounds for $P(X < Y + \mu)$. Recall the
definition of the symmetric rearrangement of the indicator function of a
Borel set ([10]; see also [7]). If $A \subset \mathbb{R}$ is a Borel set of finite Lebesgue measure
$\lambda(A)$, then the symmetric rearrangement of the set $A$, denoted by $A^*$, is the
symmetric open interval so that $\lambda(A^*) = \lambda(A)$. We let the functions

$$I_A, \quad I_A^*$$

denote the indicator functions of $A$ and $A^*$, respectively. The following is
a special case of Riesz's theorem [13]. As in the previous section, we study
distributions whose support is $[-N, N]$, with $N$ arbitrary.

Theorem 8 (F. Riesz). Let $A$, $B$ and $C$ be Borel sets of finite measure.
Then

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_A(x) I_B(x - y) I_C(y)\, dx\, dy \le \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_A^*(x) I_B^*(x - y) I_C^*(y)\, dx\, dy.$$

Fix $b > 0$ and $N > 1/(2b)$, and let $\mathcal{F}(b)$ denote the class of all symmetric
distributions on $[-N, N]$ whose distribution functions satisfy a Lipschitz
condition with Lipschitz constant $b$. Clearly $\mathcal{F}(b)$ is convex and the Lipschitz
condition ensures that $\mathcal{F}(b)$ is an equicontinuous class; because $\mathcal{F}(b)$ is closed
(in the sup norm topology), Ascoli's theorem implies that $\mathcal{F}(b)$ is compact.
We let distributions $F, G \in \mathcal{F}(b)$ and let $H \in \mathcal{F}(b)$ be the distribution with
density

(12) $u(x) = \begin{cases} b, & |x| \le 1/(2b), \\ 0, & |x| > 1/(2b). \end{cases}$

With these distributions we associate independent variables

(13) $X \sim F$, $Y \sim G$; $U_0, U_1 \sim H$.

Theorem 9. Fix $\mu > 0$. If $b\mu < 1$, then

$$P(X < Y + \mu) \le P(U_0 < U_1 + \mu) = b\mu + \tfrac12\bigl(1 - (b\mu)^2\bigr).$$

The inequalities are strict unless $f = u$ a.e. If $b\mu \ge 1$, then

$$P(X < Y + \mu) \le P(U_0 < U_1 + \mu) = 1.$$
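The bound of Theorem 9 can be compared with another symmetric density bounded by $b$. The sketch below is an illustration under assumptions made here: the closed form $b\mu + \tfrac12(1 - (b\mu)^2)$ is taken as the value for two independent uniforms on $[-1/(2b), 1/(2b)]$, and the competing density equals $b$ on $1/(2b) \le |x| \le 1/b$, a symmetric set of measure $1/b$. The probability $P(X < Y + \mu)$ is approximated by a midpoint grid; all function names are hypothetical.

```python
def riesz_upper_bound(b, mu):
    """P(U0 < U1 + mu) for U0, U1 independent uniform on [-1/(2b), 1/(2b)]."""
    if b * mu >= 1.0:
        return 1.0
    return b * mu + 0.5 * (1.0 - (b * mu) ** 2)


def uniform_density(b):
    """Density b on |x| <= 1/(2b), zero elsewhere."""
    return lambda x: b if abs(x) <= 1.0 / (2.0 * b) else 0.0


def split_density(b):
    """Density b on 1/(2b) <= |x| <= 1/b, zero elsewhere (illustrative competitor)."""
    return lambda x: b if 1.0 / (2.0 * b) <= abs(x) <= 1.0 / b else 0.0


def prob_less_grid(density, mu, half_width, n=400):
    """Midpoint-grid approximation of P(X < Y + mu) for iid X, Y with this density."""
    h = 2.0 * half_width / n
    pts = [-half_width + (k + 0.5) * h for k in range(n)]
    weights = [density(x) * h for x in pts]
    total = 0.0
    for xi, wi in zip(pts, weights):
        if wi == 0.0:
            continue
        for yj, wj in zip(pts, weights):
            if xi < yj + mu:
                total += wi * wj
    return total


if __name__ == "__main__":
    b, mu = 1.0, 0.5
    print(riesz_upper_bound(b, mu))
    print(prob_less_grid(uniform_density(b), mu, 1.0))
    print(prob_less_grid(split_density(b), mu, 1.0))
```

With $b = 1$ and $\mu = 0.5$ the split density gives a smaller probability than the uniform pair, as the theorem predicts; the grid value for the uniform pair differs from the closed form only by discretization error.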

The proof of the theorem rests on the following lemmas. We thank the 
referee for observing that the argument of the next lemma extends to an 
arbitrary probability space without symmetry conditions. 



Lemma 10. Assume that $N > 1/(2b)$. The extreme points of $\mathcal{F}(b)$ are
the distributions with the densities (up to a set of measure zero)

(14) $b I_B(x) = \begin{cases} b, & x \in B, \\ 0, & x \notin B, \end{cases}$

where $B \subset [-N, N]$ is symmetric and the Lebesgue measure $\lambda(B) = 1/b$.



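Proof. Given $G$, let $g$ be the density of $G$. Suppose that $g$ does not
satisfy (14) up to a set of measure zero: for an $\varepsilon > 0$ there is a symmetric
set

$$A = \{x : \varepsilon < g(x) < b - \varepsilon\}$$

so that $\lambda(A) > 0$. Choose disjoint symmetric subsets $A_0, A_1 \subset A$, $\lambda(A_i) > 0$,
and constants $\delta_0, \delta_1 > 0$ so that

$$\delta_0 \lambda(A_0) = \delta_1 \lambda(A_1),$$

and so that for $x \in A_0 \cup A_1$, $i = 0, 1 \pmod 2$,

$$g(x) - \delta_i > 0, \qquad g(x) + \delta_{i+1} < b.$$

Define for $i = 0, 1 \pmod 2$,

$$g_i(x) = \begin{cases} g(x), & x \in (A_0 \cup A_1)^c, \\ g(x) + (-1)^{i+1}\delta_0, & x \in A_0, \\ g(x) + (-1)^{i}\delta_1, & x \in A_1. \end{cases}$$

Then $G$ is not extremal: letting $G_i$ be the distribution functions of the $g_i$,
we have distinct $G_i \in \mathcal{F}(b)$ and $G = \tfrac12 G_0 + \tfrac12 G_1$. □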

Fix $\mu > 0$ and define the continuous bilinear functional on the convex
compact $\mathcal{F}(b) \times \mathcal{F}(b)$,

$$T(F, G) = \int_{-N}^{N} dF(x) \Bigl(\int_{(x - \mu) \vee (-N)}^{N} dG(y)\Bigr) = P(X < Y + \mu).$$

Lemma 11. $\max\{T(F, G) : F, G \in \mathcal{F}(b)\}$ is obtained at extreme points.

Proof. Recall that a continuous linear functional on a compact convex
set obtains its maximum at an extreme point. Letting $(F_n, G_n)$ satisfy
$T(F_n, G_n) \uparrow \sup\{T(F, G)\}$, one uses a compactness argument on
subsequences to verify the lemma. □

The proof of the next lemma follows from Lemma 10 and the definition 
of symmetric rearrangement. 

Lemma 12. If $F \in \mathcal{F}(b)$ is an extreme point with density $f$, then the
symmetric rearrangement $f^* = u$ [see (12)].

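Lemma 13. Fix $\mu > 0$. Let $Y_0, Y_1$ have densities $f_0, f_1$ whose
distributions are extreme points of $\mathcal{F}(b)$. Then

$$P(|Y_0 - Y_1| < \mu) \le P(|U_0 - U_1| < \mu).$$

Proof. Note that $I_{(-\mu, \mu)} = I^*_{(-\mu, \mu)}$. Using the theorem of Riesz, the
symmetry, and the preceding lemma,

$$P(|Y_0 - Y_1| < \mu) = \int_{\mathbb{R}}\int_{\mathbb{R}} I_{(-\mu, \mu)}(x) f_0(x - y) f_1(y)\, dy\, dx$$

$$\le \int_{\mathbb{R}}\int_{\mathbb{R}} I^*_{(-\mu, \mu)}(x) f_0^*(x - y) f_1^*(y)\, dy\, dx$$

$$= \int_{\mathbb{R}}\int_{\mathbb{R}} I_{(-\mu, \mu)}(x) u(x - y) u(y)\, dy\, dx$$

$$= P(|U_0 - U_1| < \mu). \qquad \square$$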

Proof of Theorem 9. This follows from the lemmas and the symmetry
assumption used in

$$P(Y_0 < Y_1 + \mu) = \tfrac12\bigl(1 + P(|Y_1 - Y_0| < \mu)\bigr). \qquad \square$$
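The symmetry identity used in this proof can be spelled out in one step. The derivation below is a sketch that assumes $D = Y_0 - Y_1$ is symmetric about zero with a continuous distribution (the symbol $D$ is introduced here), so that $P(D \le -\mu) = P(D \ge \mu) = \tfrac12\bigl(1 - P(|D| < \mu)\bigr)$.

```latex
% D = Y_0 - Y_1 is symmetric about 0 and has a continuous distribution.
\begin{align*}
P(Y_0 < Y_1 + \mu) &= P(D < \mu) = P(D \le -\mu) + P(|D| < \mu) \\
  &= \tfrac12\bigl(1 - P(|D| < \mu)\bigr) + P(|D| < \mu) \\
  &= \tfrac12\bigl(1 + P(|Y_1 - Y_0| < \mu)\bigr).
\end{align*}
```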

Acknowledgments. Anirban Dasgupta and Harry Barrett contributed to
this paper.